2026-01-01
The rstanarm package

Estimate the mean weight of students’ backpacks at UVA.
It’s just an estimate. If I take another sample, I’ll get a different estimate. Say 19.1. Or 16.8. I want to understand the uncertainty in my estimate.
Assume there is some fixed but unknown mean backpack weight. If we could see all and know all, we could estimate the “true” mean. We want to estimate that true mean.
If we were to repeat this process many, many times, 95% of the confidence intervals would contain the true value. (NOTE: A confidence interval is NOT a probability interval.)
This approach is sometimes called the Frequentist approach.
Instead of estimating some “true” mean, we instead estimate the probability distribution of possible means.
The “estimate” is the entire posterior distribution.
Prior to our analysis, perhaps we think the average backpack weighs around 18 lbs. A possible prior distribution might be a \(N(18, 5)\).
After observing our data we use Bayes’ theorem to update our prior distribution to get a posterior distribution.
There’s about 74% probability that the mean backpack weight is between 15.5 and 16.5 lbs.
Examine two brands of rechargeable batteries. How long do they run (in hours) before exhausted? Is one brand superior?
If I do this again, will I get the same basic result? Or will the result substantially change? How certain are we about the 0.86 estimate?
Assume there is some fixed but unknown difference in means. If we had access to all batteries and unlimited time, we could estimate the true difference in means.
If we were to repeat this process many, many times, 95% of the confidence intervals would contain the true value. (Reminder: A CI is not a probability interval.)
We might also do a t-test to estimate the probability of getting a difference as large (or larger) if the true difference was 0 (i.e., a p-value).
Instead of estimating some “true” value, we instead estimate the probability distribution of possible differences in means.
Perhaps we’re not sure which brand is better. A possible prior distribution might look like this, a \(N(0,3)\) distribution.
After observing our data we use Bayes’ theorem to update our prior distribution to get a posterior distribution.
There’s about a 95% probability the mean difference is between 0.4 and 1.3.
We use Bayes’ theorem to update our prior distribution to get a posterior distribution.
\[\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{average likelihood}} \]
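As a sketch of how this formula works in practice, the posterior for a single mean can be approximated on a grid. Everything below is illustrative: the data are simulated and, for simplicity, sigma is treated as known.

```r
# Grid approximation of Bayes' theorem for a single mean.
set.seed(1)
y <- rnorm(100, mean = 16, sd = 4.3)          # hypothetical backpack weights

mu_grid <- seq(10, 25, length.out = 500)      # candidate values of the mean
prior   <- dnorm(mu_grid, mean = 18, sd = 5)  # prior: N(18, 5)

# log-likelihood of the data at each candidate mean
loglik <- sapply(mu_grid, function(m) sum(dnorm(y, mean = m, sd = 4.3, log = TRUE)))

# posterior is proportional to likelihood * prior; dividing by the sum
# is the discrete version of dividing by the "average likelihood"
post <- exp(loglik - max(loglik)) * prior
post <- post / sum(post)

sum(mu_grid * post)                           # posterior mean
```

MCMC samplers such as Stan avoid the grid entirely, but the logic is the same: reweight the prior by the likelihood and normalize.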
The rstanarm package

rstanarm generates Stan code for us, and then uses the rstan package to run the model.

Estimate the mean and calculate a confidence interval using t.test().
We can also analyze the backpack data using an intercept-only linear model:
The intercept-only model:
\[\text{weight}_i \sim N(\mu, \sigma)\]
Where \(\mu\) is the intercept and \(\sigma\) is the residual standard error.
This is an important concept because doing Bayesian statistics requires us to frame our analysis as a model.
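A minimal sketch of fitting this model in R, assuming a data frame named dat with a backpacks column as in the workshop example:

```r
# Intercept-only model: the intercept is the estimated mean backpack weight
m <- lm(backpacks ~ 1, data = dat)
summary(m)
```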
Call:
lm(formula = backpacks ~ 1, data = dat)
Residuals:
Min 1Q Median 3Q Max
-9.050 -2.100 -0.100 1.475 24.950
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.0500 0.4331 37.06 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.331 on 99 degrees of freedom
The intercept-only model estimates a mean of 16.05 and a standard deviation of 4.331.
We can also extract a confidence interval from the intercept-only model:
This matches what we get with the t.test() function.
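For example, assuming the fitted model object is named m:

```r
# Model-based 95% CI for the mean (the intercept)
confint(m)

# Same interval from a one-sample t-test
t.test(dat$backpacks)$conf.int
```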
In addition to lm(), we can use glm() with family = gaussian. GLM stands for Generalized Linear Model.
2.5 % 97.5 %
15.2012 16.8988
Same model as before:
\[\text{weight}_i \sim N(\mu, \sigma)\]
“Gaussian” is another name for “normal”. We need to specify the Gaussian family because a GLM can fit many different families of models (e.g., Poisson, binomial, gamma).
Using rstanarm, we can perform a Bayesian analysis using code that is almost identical to the analysis performed with glm().
Instead of a confidence interval we obtain a Bayesian uncertainty interval:
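Assuming the rstanarm package is installed and dat is as before, the Bayesian version might look like this (the prior values are illustrative, mirroring the N(18, 5) prior discussed earlier):

```r
library(rstanarm)

# Bayesian intercept-only model with a N(18, 5) prior on the mean
bm <- stan_glm(backpacks ~ 1, data = dat, family = gaussian,
               prior_intercept = normal(18, 5))

# 95% uncertainty (credible) interval instead of a confidence interval
posterior_interval(bm, prob = 0.95)
```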
Calculate a confidence interval using t.test()
Welch Two Sample t-test
data: y by grp
t = -3.6472, df = 44.32, p-value = 0.000694
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
-1.3406537 -0.3864785
sample estimates:
mean in group 0 mean in group 1
10.16867 11.03223
Can also analyze the data using a linear model.
The model in this case:
\[\text{battery life}_i \sim N(\mu = \alpha + \beta\text{ grp}, \sigma)\]
where \(\alpha\) is mean of group 0, \(\beta\) is the difference in means between group 1 and group 0, and \(\sigma\) is the pooled standard deviation.
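A minimal sketch in R, assuming a data frame named bat with a battery-life column y and a 0/1 group indicator grp as in the workshop example:

```r
# The intercept estimates the group 0 mean; the grp coefficient
# estimates the group 1 - group 0 difference in means
m2 <- lm(y ~ grp, data = bat)
summary(m2)
```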
Call:
lm(formula = y ~ grp, data = bat)
Residuals:
Min 1Q Median 3Q Max
-2.38337 -0.44205 -0.03553 0.60252 1.42662
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.1687 0.1674 60.736 < 2e-16 ***
grp 0.8636 0.2368 3.647 0.000652 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.8371 on 48 degrees of freedom
Multiple R-squared: 0.217, Adjusted R-squared: 0.2007
F-statistic: 13.3 on 1 and 48 DF, p-value: 0.0006516
The mean for group 0 is 10.1687 and the difference in means is 0.8636. The pooled standard deviation is 0.8371.
We can also extract a confidence interval from this model:
This basically matches what we get with the t.test() function, with a change in sign.
The linear model estimates group 1 - group 0, while the t-test estimated group 0 - group 1.
The slight differences are due to the t.test() function performing Welch’s t-test by default, which allows for unequal variances between groups.
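One way to see this, using the same hypothetical bat data frame: setting var.equal = TRUE makes t.test() use the pooled variance, which should match the linear model's interval up to a change in sign.

```r
# Pooled-variance t-test: matches the linear model (up to sign)
t.test(y ~ grp, data = bat, var.equal = TRUE)

# Welch's t-test (the default) allows unequal variances
t.test(y ~ grp, data = bat)
```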
Once again we can use glm() with family = gaussian.
Again, we can transition smoothly to a Bayesian analysis using rstanarm. The syntax is practically the same with an argument for the prior.
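Assuming bat as before, the Bayesian version might look like this (the prior mirrors the N(0, 3) prior on the difference discussed earlier):

```r
library(rstanarm)

# Bayesian two-group model with a N(0, 3) prior on the difference
bm2 <- stan_glm(y ~ grp, data = bat, family = gaussian,
                prior = normal(0, 3))

posterior_interval(bm2, prob = 0.95)
```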
Instead of a confidence interval we obtain a Bayesian uncertainty interval:
Priors in rstanarm

rstanarm provides weakly informative priors by default (i.e., priors that rule out extreme positive or negative values).
Note that stan_lm and stan_glm specify priors differently.
Use the prior_summary() function on the model object to see the priors that were used.

Let’s go to the R script and work with our examples.
When doing Bayesian data analysis with MCMC sampling, we want to confirm that the sample we obtained describes the posterior.
We should do this before interpreting results and making inferences.
There are several diagnostics that can help with this.
Thankfully rstanarm provides warnings when sampling is suspect.
rstanarm runs 4 chains by default. We can think of these as 4 expeditions to explore the posterior. We hope to see the chains become indistinguishable from one another.
Rhat: reported in the summary output of the model in the “MCMC diagnostics” section. Values near 1 indicate the chains converged. If Rhat is large, the Markov chains did not converge! Do not analyze results! Try increasing the number of iterations with the iter argument.

Effective sample size (n_eff): corrects for any autocorrelation and approximates the number of independent samples we have. A common guideline is n_eff > 1000. Also reported in the summary output of the model in the “MCMC diagnostics” section.

Monte Carlo standard error (mcse): also reported in the summary output of the model in the “MCMC diagnostics” section.

Posterior predictive check: pp_check(stan_mod) compares data simulated from the model to the observed data.

Let’s return to the R script.
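A quick sketch of these checks, assuming a fitted rstanarm model named bm (a hypothetical name from the earlier example):

```r
# Rhat, n_eff, and mcse appear in the "MCMC diagnostics"
# section of the summary output
summary(bm)

# Graphical posterior predictive check: does data simulated
# from the model resemble the observed data?
pp_check(bm)
```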
Once we have assessed the sampling quality and model fit, we’re ready to summarize and interpret the posterior distributions.
Common summaries include:
These are provided in the summary of a model fit with rstanarm. By default, intervals appear in the summary output in the 10% and 90% columns. Use the probs argument to request different intervals. For example, to request a 95% interval for model “m”: summary(m, probs = c(0.025, 0.975)).

Things to keep in mind:
The default number of posterior samples in rstanarm is 4000 (4 chains of 2000 iterations each, with the first 1000 as warm-up). The summary output in rstanarm summarizes these samples from the posterior distributions. We can also create our own summaries of the posterior distributions.
The as.matrix and as.data.frame functions extract the posterior samples into a matrix and data frame, respectively.
Let’s say we want to estimate the probability that the mean backpack weight is greater than 16 lbs. Notice we specify the (Intercept) parameter with the $ operator, because there are two parameters to choose from: (Intercept) and sigma.
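A sketch of this calculation, assuming the intercept-only rstanarm model is named bm:

```r
post <- as.data.frame(bm)       # one row per posterior draw
names(post)                     # "(Intercept)" and "sigma"

# P(mean backpack weight > 16): proportion of draws above 16
mean(post$`(Intercept)` > 16)
```

Backticks are needed around `(Intercept)` because the column name contains parentheses.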
Let’s say we want to estimate the probability that the difference in battery life is greater than 1 hour. Notice we specify the grp parameter with the $ operator, because there are three parameters to choose from: (Intercept), grp, and sigma.
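A sketch of this calculation, assuming the two-group rstanarm model is named bm2:

```r
post <- as.data.frame(bm2)      # columns: (Intercept), grp, sigma

# P(difference in mean battery life > 1 hour)
mean(post$grp > 1)
```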
Let’s say we want to create 95% credible intervals for the means of the two brands of batteries. The Intercept is the estimated mean of grp 0. The sum of the Intercept and grp coefficient is the estimated mean of grp 1.
2.5% 97.5%
9.82601 10.49629
2.5% 97.5%
10.69355 11.35564
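Intervals like these can be computed directly from the posterior draws, again assuming the two-group model is named bm2:

```r
post <- as.data.frame(bm2)

# 95% credible interval for the grp 0 mean
quantile(post$`(Intercept)`, probs = c(0.025, 0.975))

# 95% credible interval for the grp 1 mean
quantile(post$`(Intercept)` + post$grp, probs = c(0.025, 0.975))
```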
Let’s return to the R script.
ShinyStan

rstanarm models can be diagnosed and explored with the ShinyStan web app: run launch_shinystan(mod), where mod is your rstanarm model.

Some published analyses using rstanarm:

Espe, M. et al. (2016) Yield gap analysis of US rice production systems shows opportunities for improvement. Field Crops Research, 196:276-283.
Kubrak, O. et al. (2017) Adaptation to fluctuating environments in a selection experiment with Drosophila melanogaster. Ecology and Evolution, 7:3796-3807.
Herzog, S. et al. (2017) Sun Protection Factor Communication of Sunscreen Effectiveness. JAMA Dermatology, 153(3):348-350.
Kovic, M. and Hänsli, N. (2017) The impact of political cleavages, religiosity, and values on attitudes towards nonprofit organizations. Social Sciences, 7(1), 2.
Betancourt, Michael. (2018) A Conceptual Introduction to Hamiltonian Monte Carlo. https://arxiv.org/pdf/1701.02434.pdf
Gelman, A., Hill, J., & Vehtari, A. (2020) Regression and Other Stories. Cambridge University Press.
McElreath, R. (2020). Statistical Rethinking, 2nd Ed. CRC Press. Boca Raton.
Muth, C., Oravecz, Z., & Gabry, J. (2018). User-friendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology, 14(2), 99-119. doi:10.20982/tqmp.14.2.p099
rstanarm web site: http://mc-stan.org/rstanarm/
Theoretical Ecology. (2010, September 17). Metropolis-Hastings MCMC in R. https://theoreticalecology.wordpress.com/2010/09/17/metropolis-hastings-mcmc-in-r/
For free statistical consulting, contact us: statlab@virginia.edu
Sign up for more workshops or see past workshops:
https://library.virginia.edu/data/training
Register for the Research Data Services newsletter to be notified of new workshops:
https://library.virginia.edu/data/newsletters